7. Brief conclusions

I predicted the tips in NYC taxis with an accuracy of 71.74% using random forests. Cool! Especially considering the following issues:

  • The initial status of the data.
  • There is no information about the passengers, which is essential data for this task.
  • It's my first time working on a real case.

Think about what you could do with this dataset by adding more information sources, like:

  • Weather
  • Subway and bus
  • Plane and train arrivals and departures
  • Sports and concerts
  • ... whatever you can imagine

You could solve a huge range of problems!
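
As a rough illustration of what adding a source means in practice, here is a minimal sketch that attaches hourly weather readings to trips by truncating each pickup time to the hour. The field names (`pickup_datetime`, `tip_amount`, `hour`, `rain_mm`) and the sample rows are made up for the example; they are not taken from the notebooks or from any real weather feed.

```python
from datetime import datetime

# Hypothetical sample rows: field names and values are assumptions for illustration.
trips = [
    {"pickup_datetime": datetime(2013, 1, 1, 8, 15), "tip_amount": 2.0},
    {"pickup_datetime": datetime(2013, 1, 1, 8, 50), "tip_amount": 0.0},
    {"pickup_datetime": datetime(2013, 1, 1, 9, 5), "tip_amount": 3.5},
]
hourly_weather = [
    {"hour": datetime(2013, 1, 1, 8), "rain_mm": 0.0},
    {"hour": datetime(2013, 1, 1, 9), "rain_mm": 4.2},
]


def add_weather(trips, hourly_weather):
    """Return copies of the trips with the rain reading for their pickup hour."""
    # Index the weather readings by hour for constant-time lookups.
    by_hour = {w["hour"]: w for w in hourly_weather}
    enriched = []
    for trip in trips:
        # Truncate the pickup time to the start of its hour to build the join key.
        hour = trip["pickup_datetime"].replace(minute=0, second=0, microsecond=0)
        weather = by_hour.get(hour)
        row = dict(trip)  # copy so the original trip is left untouched
        # Trips without a matching reading get None instead of being dropped.
        row["rain_mm"] = weather["rain_mm"] if weather else None
        enriched.append(row)
    return enriched


enriched = add_weather(trips, hourly_weather)
```

The new `rain_mm` column could then be fed to the model as one more feature. With data already in pandas, a merge on the same hourly key would likely be the more natural route.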

So, feel completely free to fork the repository and play with the notebooks. If you run into any problems, please contact me through my GitHub or Twitter accounts.

Finally, I'd like to say thank you again for the data, Chris. And of course, thank you too, Fernando, for supervising all my work.